Nonasymptotic quasi-optimality of AIC and the slope heuristics in maximum likelihood estimation of density using histogram models

Authors

  • Adrien Saumard
  • A. Saumard
Abstract

We consider nonparametric maximum likelihood estimation of density using linear histogram models. More precisely, we investigate optimality of model selection procedures via penalization, when the number of models is polynomial in the number of data. It turns out that the Slope Heuristics first formulated by Birgé and Massart [10] is satisfied under rather mild conditions on the density to be estimated and the structure of the considered partitions. This suggests a new look at the AIC penalty: more precisely, we show that the minimal penalty in the sense of Birgé and Massart is equivalent to half the AIC penalty. Thus, as soon as the chosen penalty is larger than half AIC, the model selection procedure satisfies an oracle inequality. On the contrary, if the penalty is less than the minimal one, then the procedure totally misbehaves. Moreover, if the penalty is equal to the AIC penalty and the number of data is large enough, then the model selection procedure is nearly optimal in the sense that it satisfies a nonasymptotic, trajectorial oracle inequality with constant almost one, tending to one when the number of data goes to infinity. Finally, it is, to our knowledge, the first time that the Slope Heuristics is theoretically validated in a non-quadratic setting.

Keywords: Maximum likelihood, density estimation, AIC, Optimal model selection, Slope heuristics, Penalty calibration.

1 Introduction

This paper is devoted to the study of some penalized maximum likelihood model selection procedures for the estimation of density on histograms. There is a huge amount of literature on the problem of model selection by penalized maximum likelihood criteria, even on the more restrictive question of selecting a histogram, that goes back to Akaike's pioneering work. In the early seventies, Akaike [1] proposed to select a model by penalizing the empirical likelihood of maximum likelihood estimators by the number of parameters in each model.
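To make the penalized criterion concrete, here is a minimal sketch that is not taken from the paper. It assumes data on [0, 1], regular partitions into D bins, and a penalty proportional to the number of free parameters D - 1. The function names and the `penalty_factor` parameter are illustrative assumptions; on the log-likelihood scale, `penalty_factor=1.0` corresponds to the usual AIC penalty and `penalty_factor=0.5` to half AIC.

```python
import math
import random


def histogram_log_likelihood(data, n_bins):
    """Maximized log-likelihood of a regular histogram density on [0, 1]."""
    n = len(data)
    counts = [0] * n_bins
    for x in data:
        # Clamp so that x == 1.0 falls into the last bin.
        idx = min(int(x * n_bins), n_bins - 1)
        counts[idx] += 1
    # The MLE height on bin k is counts[k] * n_bins / n; empty bins add 0.
    return sum(c * math.log(c * n_bins / n) for c in counts if c > 0)


def select_bins(data, max_bins, penalty_factor=1.0):
    """Return the D in 1..max_bins minimizing -loglik + penalty_factor * (D - 1).

    Assumption for this sketch: a model with D bins has D - 1 free parameters.
    """
    best_d, best_crit = 1, float("inf")
    for d in range(1, max_bins + 1):
        crit = -histogram_log_likelihood(data, d) + penalty_factor * (d - 1)
        if crit < best_crit:
            best_d, best_crit = d, crit
    return best_d


if __name__ == "__main__":
    random.seed(0)
    sample = [random.betavariate(2, 5) for _ in range(500)]
    print("AIC choice:     ", select_bins(sample, 30, penalty_factor=1.0))
    print("half-AIC choice:", select_bins(sample, 30, penalty_factor=0.5))
```

Running the selection over a grid of `penalty_factor` values and watching where the selected dimension jumps is one informal way to see the minimal-penalty behaviour that the slope heuristics describes.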
The analysis of Akaike [1] of the model selection procedure defined by the so-called Akaike's Information Criterion (AIC) is fundamentally asymptotic, in the sense that the author considers a given finite collection of models with the number of data going to infinity. This asymptotic setting is irrelevant in many situations, and thus many efforts have been made to develop nonasymptotic analyses of model selection procedures, letting the dimension of the models and the cardinality of the collection of models depend on the number of data. As pointed out by Boucheron and Massart [11], it is nevertheless worth mentioning that early works of Akaike [2] and Mallows [22] in model selection relied, although in a disguised form, on Wilks' phenomenon (Wilks [28]), which asserts that in smooth parametric density estimation the difference between the maximum likelihood and the likelihood of the sampling distribution converges towards a chi-square distribution whose number of degrees of freedom coincides with the model dimension. This phenomenon has been generalized by Boucheron and Massart [11] in a nonasymptotic way, considering the empirical excess risk in an M-estimation setting with bounded contrast, and is actually one of the main results supporting the conjecture that the slope heuristics introduced by Birgé and Massart [10] holds in some general framework; see Arlot and Massart [6].

Let us now describe some works related to the selection of maximum likelihood estimators. Barron and Sheu [9] give some risk bounds on maximum likelihood estimation considering sequences of regular exponential families made of polynomials, splines and trigonometric series. They achieve an accurate


Similar resources

The Development of Maximum Likelihood Estimation Approaches for Adaptive Estimation of Free Speed and Critical Density in Vehicle Freeways

The performance of many traffic control strategies depends on how accurately the traffic flow models have been calibrated. One of the most widely used traffic flow models in traffic control and management is the LWR or METANET model. In practice, key parameters of the LWR model, including free-flow speed and critical density, are parameterized using flow and speed measurements gathered by inductive ...

Change Point Estimation of the Stationary State in Auto Regressive Moving Average Models, Using Maximum Likelihood Estimation and Singular Value Decomposition-based Filtering

In this paper, change point estimation is applied for the first time to the stationary state of autoregressive moving average ARMA(1,1) models. In the monitoring phase, when the characteristic under study follows a time series, i.e., ARMA(1,1), an approach based on the maximum likelihood technique is developed for the estimation of the stationary state’s change...

Estimating the Size Probability of Landslides Occurring in the Pivejan Watershed (Razavi Khorasan Province)

Knowing the number, area, and frequency of landslides that have occurred in an area plays a prominent role in understanding the long-term evolution of landslide-dominated areas and can be used for analyzing susceptibility, hazard, and risk. In this regard, the current research examines the size probability of identified landslides in the Pivejan Watershed, Razavi Khorasan Province. In the first step, lands...

Modified Maximum Likelihood Estimation in First-Order Autoregressive Moving Average Models with some Non-Normal Residuals

When modeling time series data using autoregressive moving average processes, it is common practice to presume that the residuals are normally distributed. However, we sometimes encounter non-normal residuals and asymmetry in the marginal distribution of the data. Despite widespread use of pure autoregressive processes for modeling non-normal time series, the autoregressive moving average models have le...

Publication date: 2017